The Libra Toolkit for Probabilistic Models
The Libra Toolkit is a collection of algorithms for learning and inference
with discrete probabilistic models, including Bayesian networks, Markov
networks, dependency networks, and sum-product networks. Compared to other
toolkits, Libra places a greater emphasis on learning the structure of
tractable models in which exact inference is efficient. It also includes a
variety of algorithms for learning graphical models in which inference is
potentially intractable, and for performing exact and approximate inference.
Libra is released under a 2-clause BSD license to encourage broad use in
academia and industry.
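To make the class of models concrete, here is a minimal, illustrative Python sketch of the kind of discrete Bayesian network Libra works with; it is not Libra's API, and the variable names and probabilities are made up for the example. The joint distribution is factored into per-variable conditional probability tables:

    from itertools import product

    # P(Rain), P(Sprinkler | Rain), P(WetGrass | Rain, Sprinkler);
    # all numbers are invented for illustration.
    p_rain = {True: 0.2, False: 0.8}
    p_sprinkler = {True: {True: 0.01, False: 0.99},   # given Rain=True
                   False: {True: 0.4, False: 0.6}}    # given Rain=False
    p_wet = {(True, True): 0.99, (True, False): 0.8,
             (False, True): 0.9, (False, False): 0.0}  # P(Wet=True | R, S)

    def joint(r, s, w):
        """Joint probability from the factored form P(R) P(S|R) P(W|R,S)."""
        pw = p_wet[(r, s)]
        return p_rain[r] * p_sprinkler[r][s] * (pw if w else 1.0 - pw)

    # Marginal P(WetGrass=True) by summing out the other variables.
    # Enumeration like this is exponential in general, which is why the
    # toolkit emphasizes tractable structures where exact inference is cheap.
    print(sum(joint(r, s, True) for r, s in product([True, False], repeat=2)))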
Differential Equation Units: Learning Functional Forms of Activation Functions from Data
Most deep neural networks use simple, fixed activation functions, such as
sigmoids or rectified linear units, regardless of domain or network structure.
We introduce differential equation units (DEUs), an improvement to modern
neural networks, which enables each neuron to learn a particular nonlinear
activation function from a family of solutions to an ordinary differential
equation. Specifically, each neuron may change its functional form during
training based on the behavior of the other parts of the network. We show that
using neurons with DEU activation functions results in a more compact network
capable of achieving comparable, if not superior, performance when compared
to much larger networks.
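The paper derives each neuron's activation from the solutions of an ordinary differential equation with learnable coefficients. The simplified PyTorch sketch below is not that construction; it only illustrates the general idea of a per-neuron activation whose functional form is learned during training, here as a learnable mixture over a small basis of fixed activations:

    import torch
    import torch.nn as nn

    class LearnableActivation(nn.Module):
        """Simplified stand-in, NOT the paper's ODE construction: each
        neuron blends a small basis of activations with learnable weights,
        so its effective nonlinearity can change during training."""
        def __init__(self, num_neurons):
            super().__init__()
            # One mixing vector per neuron over the three basis functions.
            self.logits = nn.Parameter(torch.zeros(num_neurons, 3))

        def forward(self, x):
            w = torch.softmax(self.logits, dim=-1)       # (neurons, 3)
            basis = torch.stack((torch.relu(x),
                                 torch.sigmoid(x),
                                 x), dim=-1)             # (..., neurons, 3)
            return (w * basis).sum(dim=-1)

    layer = nn.Sequential(nn.Linear(8, 16), LearnableActivation(16))
    out = layer(torch.randn(4, 8))  # shape (4, 16)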
Learning Tractable Graphical Models
Probabilistic graphical models have been successfully applied to a wide variety of fields, such as computer vision, natural language processing, and robotics. However, for large-scale problems represented with unrestricted probabilistic graphical models, exact inference is often intractable: the model cannot compute the correct value of a joint probability query in a reasonable time. Approximate inference, in which the exact joint probability is approximated, has typically been used to address this intractability. An increasingly popular alternative is tractable models, which are constrained so that exact inference is efficient. To offer efficient exact inference, tractable models rely either on graph-theoretic properties, such as bounded treewidth, or on structural properties, such as local structure, determinism, or symmetry. An appealing group of probabilistic models that capture local structure and determinism includes arithmetic circuits (ACs) and sum-product networks (SPNs), in which marginal and conditional queries can be answered efficiently.
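As a concrete illustration of why marginal queries are efficient in these models, here is a minimal SPN evaluation sketch (hypothetical classes, not any particular library's API): a marginal is computed in a single bottom-up pass, with leaves over summed-out variables evaluating to 1.

    class Leaf:
        def __init__(self, var, p_true):
            self.var, self.p_true = var, p_true
        def value(self, evidence):          # evidence: {var: bool}, partial
            if self.var not in evidence:    # variable is summed out
                return 1.0
            return self.p_true if evidence[self.var] else 1.0 - self.p_true

    class Sum:
        def __init__(self, weights, children):
            self.weights, self.children = weights, children
        def value(self, evidence):
            return sum(w * c.value(evidence)
                       for w, c in zip(self.weights, self.children))

    class Product:
        def __init__(self, children):
            self.children = children
        def value(self, evidence):
            out = 1.0
            for c in self.children:
                out *= c.value(evidence)
            return out

    # A tiny mixture: P(A, B) = 0.6 * P1(A) P1(B) + 0.4 * P2(A) P2(B)
    spn = Sum([0.6, 0.4],
              [Product([Leaf('A', 0.9), Leaf('B', 0.2)]),
               Product([Leaf('A', 0.1), Leaf('B', 0.7)])])
    print(spn.value({'A': True}))   # marginal P(A=True), with B summed out

One pass over the circuit answers the query, so inference cost is linear in the circuit size rather than exponential in the number of variables.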
In this dissertation, we describe ID-SPN, a state-of-the-art SPN learner, as well as novel methods for learning tractable graphical models in a discriminative setting, in particular by introducing Generalized ACs, which combine ACs and neural networks.
Using extensive experiments, we show that the proposed methods often achieve better performance than selected baselines. This dissertation includes previously published and unpublished co-authored material.
Discriminative Structure Learning of Arithmetic Circuits
The biggest limitation of probabilistic graphical models is the complexity of inference, which is often intractable. An appealing alternative is to use tractable probabilistic models, such as arithmetic circuits (ACs) and sum-product networks (SPNs), in which marginal and conditional queries can be answered efficiently. In this paper, we present the first discriminative structure learning algorithm for ACs, DACLearn (Discriminative AC Learner), which optimizes conditional log-likelihood. Based on our experiments, DACLearn learns models that are more accurate and compact than other tractable generative and discriminative baselines.
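For intuition, the objective DACLearn optimizes can be sketched as follows; this is a toy, self-contained Python example with a hypothetical model interface and made-up probabilities, not DACLearn's actual code:

    import math

    class ToyJointModel:
        """Hypothetical stand-in for a learned circuit: returns P(y, x)
        for a binary label y and one binary feature x (numbers made up)."""
        label_values = (0, 1)
        _p = {(0, 0): 0.30, (0, 1): 0.10, (1, 0): 0.15, (1, 1): 0.45}
        def prob(self, y, x):
            return self._p[(y, x)]

    def conditional_log_likelihood(model, data):
        """The discriminative objective: sum_i log P(y_i | x_i), obtained
        from the joint by normalizing over the label values."""
        cll = 0.0
        for x, y in data:
            norm = sum(model.prob(y2, x) for y2 in model.label_values)
            cll += math.log(model.prob(y, x) / norm)
        return cll

    print(conditional_log_likelihood(ToyJointModel(), [(1, 1), (0, 0)]))

Maximizing this score rewards structures that predict the label well given the evidence, rather than structures that model the full joint distribution.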
Learning Markov Networks With Arithmetic Circuits
Markov networks are an effective way to represent complex probability distributions. However, learning their structure and parameters or using them to answer queries is typically intractable. One approach to making learning and inference tractable is to use approximations, such as pseudo-likelihood or approximate inference. An alternative approach is to use a restricted class of models in which exact inference is always efficient. Previous work has explored low-treewidth models, models with tree-structured features, and latent variable models. In this paper, we introduce ACMN, the first-ever method for learning efficient Markov networks with arbitrary conjunctive features. The secret to ACMN’s greater flexibility is its use of arithmetic circuits, a linear-time inference representation that can handle many high-treewidth models by exploiting local structure. ACMN uses the size of the corresponding arithmetic circuit as a learning bias, allowing it to trade off accuracy and inference complexity. In experiments on 12 standard datasets, the tractable models learned by ACMN are more accurate than both tractable models learned by other algorithms and approximate inference in intractable models.
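The learning bias described here can be sketched as a size-penalized score; the toy Python illustration below uses made-up numbers and is not ACMN's actual implementation. Each candidate conjunctive feature is scored by its likelihood gain minus a penalty proportional to how much it grows the circuit:

    def score(gain, circuit_growth, per_edge_cost=0.01):
        """Size-penalized score in the spirit of ACMN: likelihood gain
        traded off against growth of the arithmetic circuit, which is a
        proxy for inference cost."""
        return gain - per_edge_cost * circuit_growth

    # (likelihood gain, circuit-size increase) for hypothetical candidate
    # conjunctive features; all values are invented for illustration.
    candidates = {
        "smoker & over_60": (0.042, 3),
        "smoker & cough": (0.035, 1),
        "over_60 & cough & fever": (0.050, 40),
    }

    # Greedy step: pick the best-scoring candidate. A learner would keep
    # adding features while the best score stays positive, so accuracy is
    # balanced against inference complexity.
    best = max(candidates, key=lambda f: score(*candidates[f]))
    print(best, score(*candidates[best]))

Note how the third candidate has the largest raw gain but is rejected under this bias because it would blow up the circuit, and with it the cost of exact inference.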